Data available through: 2020-05-28
I have restricted the charts to show values from March 15, 2020 onward. This is when case numbers started to rise and preventative measures started to increase dramatically.
These charts show the cumulative totals of cases and deaths for each day, along with the current totals. These totals are important; however, they are not helpful for figuring out whether the pandemic is slowing down or growing.
Looking at new cases each day can help us see if the pandemic is slowing. If the number of new cases per day is decreasing, then the pandemic is slowing.
There can be a lot of variability in the daily case totals for a variety of reasons. One factor is the availability of tests; reported cases will go down if tests are scarce and rise dramatically when more tests become available. One way to get a better sense of the overall trend is to smooth the data using a moving average.
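As a rough illustration of the smoothing step, here is a minimal Python sketch of a trailing 14-day moving average. The function name `moving_average` is my own, and the sketch assumes the window covers the current day plus the 13 days before it; the original analysis may compute it differently. The input is the New Cases column for May 15 through May 28, 2020.

```python
def moving_average(values, window=14):
    """Trailing moving average; None until a full window is available."""
    averages = []
    for i in range(len(values)):
        if i + 1 < window:
            # Not enough history yet to fill the window
            averages.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            averages.append(sum(chunk) / window)
    return averages


# New cases for May 15 through May 28, 2020 (oldest first)
new_cases = [24906, 24512, 18518, 21621, 20607, 22069, 25954,
             23357, 22277, 20645, 19070, 18015, 20063, 24946]

ma = moving_average(new_cases, window=14)
print(round(ma[-1]))  # 21897
```

Under that assumption, the result for May 28 agrees with the 21,897 reported in the New Cases 14-Day MA column on the Growth Factor tab.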
The trends and raw data show a peak around mid-April and have been moving downward since. This may be due to enacting stricter social distancing and lock-downs across the country. There is also a cyclical nature to the daily new cases, with counts often being lower on weekends and higher on weekdays.
COVID-19 is much deadlier than the common flu. One way to measure the impact is to look at the death percentage, which is the total number of deaths divided by the total number of cases.
A big concern during April was that the death percentage was continually increasing, even when actual deaths per day were not increasing. Starting in early May, the death percentage started to plateau around 6%.
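The death percentage is simple arithmetic, sketched below in Python with the May 28, 2020 totals. The function name `death_percentage` is a label I chose for illustration, and returning `None` when there are no cases is my own assumption about how to handle that edge case.

```python
def death_percentage(total_deaths, total_cases):
    """Cumulative deaths as a percentage of cumulative cases."""
    if total_cases == 0:
        # Assumption: report nothing rather than divide by zero
        return None
    return 100 * total_deaths / total_cases


# Totals for Thu, May 28, 2020
print(f"{death_percentage(100478, 1714870):.3f}%")  # 5.859%
```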
Similar to new cases per day, deaths per day have been trending downward since about mid-April, although there are still spikes. These spikes may be due to reporting times: they seem to follow a weekly cycle, with lower counts generally occurring on weekends and higher counts during the week.
The actual values for the previous 14 days are detailed in the table below.
| Date | Total Cases | New Cases | Total Deaths | New Deaths | Death Percentage |
|---|---|---|---|---|---|
| Thu, May 28, 2020 | 1,714,870 | 24,946 | 100,478 | 1,208 | 5.859% |
| Wed, May 27, 2020 | 1,689,924 | 20,063 | 99,270 | 1,395 | 5.874% |
| Tue, May 26, 2020 | 1,669,861 | 18,015 | 97,875 | 640 | 5.861% |
| Mon, May 25, 2020 | 1,651,846 | 19,070 | 97,235 | 695 | 5.886% |
| Sun, May 24, 2020 | 1,632,776 | 20,645 | 96,540 | 639 | 5.913% |
| Sat, May 23, 2020 | 1,612,131 | 22,277 | 95,901 | 1,101 | 5.949% |
| Fri, May 22, 2020 | 1,589,854 | 23,357 | 94,800 | 1,197 | 5.963% |
| Thu, May 21, 2020 | 1,566,497 | 25,954 | 93,603 | 1,332 | 5.975% |
| Wed, May 20, 2020 | 1,540,543 | 22,069 | 92,271 | 1,495 | 5.990% |
| Tue, May 19, 2020 | 1,518,474 | 20,607 | 90,776 | 1,536 | 5.978% |
| Mon, May 18, 2020 | 1,497,867 | 21,621 | 89,240 | 1,018 | 5.958% |
| Sun, May 17, 2020 | 1,476,246 | 18,518 | 88,222 | 762 | 5.976% |
| Sat, May 16, 2020 | 1,457,728 | 24,512 | 87,460 | 1,081 | 6.000% |
| Fri, May 15, 2020 | 1,433,216 | 24,906 | 86,379 | 1,458 | 6.027% |
One important calculation is the growth factor, as outlined in 3Blue1Brown’s YouTube video on exponential growth. The growth factor is calculated as follows:
\[ \text{Growth Factor} = \frac{ \text{New-Cases}_N}{\text{New-Cases}_{N-1}} \] where \(N\) is a given day. Essentially, this takes the number of new cases today and divides it by the number of new cases yesterday.
The growth factor can be very helpful in determining if the pandemic is slowing. If the growth factor is less than 1, the number of new cases today is less than yesterday. Multiple days with a growth factor less than 1 are a strong sign that the pandemic is slowing down.
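To make the calculation concrete, here is a small Python sketch that applies the formula to the new case counts for May 15 through May 28, 2020. The function name `growth_factors` is hypothetical, and the sketch assumes no day in the series has zero new cases.

```python
def growth_factors(new_cases):
    """Ratio of each day's new cases to the previous day's.

    Assumes no day has zero new cases; a zero would raise ZeroDivisionError.
    """
    return [today / yesterday
            for yesterday, today in zip(new_cases, new_cases[1:])]


# New cases for May 15 through May 28, 2020 (oldest first)
new_cases = [24906, 24512, 18518, 21621, 20607, 22069, 25954,
             23357, 22277, 20645, 19070, 18015, 20063, 24946]

gf = growth_factors(new_cases)
print(round(gf[-1], 2))  # 1.24 (May 28 compared with May 27)

# Count the days on which new cases fell relative to the day before
days_below_one = sum(1 for g in gf if g < 1)
print(days_below_one)  # 8 of the 13 day-over-day ratios
```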
Adjustment to Growth Factor
What if there were 0 cases yesterday? This would make the growth factor undefined (or \(\infty\) according to R), which makes it difficult to look at trends. I have adjusted the growth factor so that if the previous day had 0 cases, the current day’s growth factor is equal to the number of new cases:
\[ \text{Growth Factor} = \begin{cases} \frac{ \text{New-Cases}_N}{\text{New-Cases}_{N-1}} & \text{if } \text{New-Cases}_{N-1} \neq 0 \\[1ex] \text{New-Cases}_N & \text{if } \text{New-Cases}_{N-1} = 0 \end{cases} \] When I made this adjustment I was thinking about the early stages of the pandemic, when the number of cases per day is 0, 1, or 2. However, given test scarcity and reporting times, there are situations in counties or states where there were 0 cases one day and then hundreds or thousands the next day. This is why some of the growth factors are much larger.
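The piecewise definition can be sketched in a few lines of Python. The function name `adjusted_growth_factor` and the daily counts in `early` are made up for illustration; they are not taken from the data.

```python
def adjusted_growth_factor(today, yesterday):
    """Growth factor that falls back to today's count when yesterday had 0 cases."""
    if yesterday == 0:
        return float(today)
    return today / yesterday


# Hypothetical daily new-case counts, including two zero days
early = [0, 1, 2, 0, 300]
adjusted = [adjusted_growth_factor(today, yesterday)
            for yesterday, today in zip(early, early[1:])]
print(adjusted)  # [1.0, 2.0, 0.0, 300.0]
```

The last value shows the side effect described above: a jump from 0 to 300 reported cases produces a growth factor of 300.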
As with new cases per day, there can be a lot of variability in the growth factor. To get a better sense of the trend, I am showing a 14-day moving average of the growth factor.
The growth factor shows a different trend than new cases. Here, the growth factor has stayed around 1 since mid-April. Compare that to the new cases plot on the Overview tab, which shows a downward trend. The growth factor remaining around 1 may be due to the cyclical nature of new case reporting (high during the week, low on weekends), but it could also be showing that, although the decrease in new cases is positive, we are not out of the woods yet.
What if, instead of looking at the growth factor and the average of the growth factor, we calculated the growth factor on the average of new cases? I’m going to call this \(\text{GF}_{14}\) to represent that it’s the growth factor based on the 14-day moving average of new cases.
The 14-day moving average of new cases is clearly smoother than the raw new cases.
Using the new cases 14-day moving average, calculate the \(\text{GF}_{14}\):
\[ \text{GF}_{14} = \frac{ \frac{1}{14}\sum_{i=N-13}^{N} \text{New-Cases}_i }{ \frac{1}{14}\sum_{i=N-14}^{N-1} \text{New-Cases}_i } \]
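Here is a minimal Python sketch of this ratio, assuming each moving average covers exactly 14 days ending on the day in question. The function name `gf_14` and the constant test series are hypothetical; the two moving-average values are the rounded figures for May 27 and May 28 from the table further down.

```python
def gf_14(new_cases, day, window=14):
    """Ratio of the trailing moving average at `day` to the one at `day - 1`.

    `day` is a 0-based index into `new_cases`. Raises ZeroDivisionError if
    the previous window contains no cases at all.
    """
    if day < window:
        raise ValueError("need at least window + 1 days of data")
    current = sum(new_cases[day - window + 1 : day + 1]) / window
    previous = sum(new_cases[day - window : day]) / window
    return current / previous


# Hypothetical flat series: identical averages give a ratio of exactly 1
print(gf_14([100] * 15, day=14))  # 1.0

# Using the rounded 14-day moving averages for May 27 and May 28, 2020
ma_may_27, ma_may_28 = 22029, 21897
print(round(ma_may_28 / ma_may_27, 3))  # 0.994
```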
I am showing both the growth factor 14-day moving average and the \(\text{GF}_{14}\) for comparison.
It turns out it doesn’t make a significant difference! Even the growth factor based on the 14-day moving average of new cases, which smooths over the cyclical nature of new case reporting, results in a value, \(\text{GF}_{14}\), that does not deviate from 1.
| Standard deviation for | Growth Factor | Growth Factor 14-day MA | GF_14 |
|---|---|---|---|
| All dates | 0.7217919 | 0.3266181 | 0.2371494 |
| Since 2020-03-15 | 0.2062289 | 0.1600012 | 0.1283394 |
| Last 14 days | 0.1329431 | 0.0078870 | 0.0081749 |
Looking at the standard deviations, it’s unsurprising that, across all dates and since March 15, \(\text{GF}_{14}\) has the least deviation, followed by the growth factor 14-day moving average, with the raw growth factor having the most. Over the last 14 days the two smoothed measures are nearly identical.
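As a rough check on the last-14-days row, the Python sketch below computes the standard deviation of the two growth factor columns for May 15 through May 28, 2020. It assumes the sample standard deviation (the n - 1 form, which is what R's `sd` computes) and uses the rounded values shown in the table below, so the results only approximate the figures above.

```python
from statistics import stdev

# Growth Factor column, May 15 through May 28, 2020 (oldest first, rounded)
growth_factor = [0.93, 0.98, 0.76, 1.17, 0.95, 1.07, 1.18,
                 0.90, 0.95, 0.93, 0.92, 0.94, 1.11, 1.24]

# Growth Factor 14-day MA column for the same days (rounded)
growth_factor_ma = [0.99, 1.00, 0.99, 1.01, 1.00, 1.01, 1.01,
                    1.00, 1.00, 1.01, 1.01, 1.00, 1.01, 1.00]

sd_raw = stdev(growth_factor)    # sample standard deviation (n - 1)
sd_ma = stdev(growth_factor_ma)

# Approximate only: the inputs are rounded to two decimals
print(f"raw: {sd_raw:.4f}, 14-day MA: {sd_ma:.4f}")
```

Even with rounded inputs, the smoothed series varies far less than the raw growth factor.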
Since there isn’t a large difference between \(\text{GF}_{14}\) and the growth factor 14-day moving average, I’m going to use the growth factor 14-day moving average instead of \(\text{GF}_{14}\). I want to keep the calculations as simple and as close to the raw data as possible.
| Date | Total Cases | New Cases | New Cases 14-Day MA | Growth Factor | Growth Factor 14-day MA |
|---|---|---|---|---|---|
| Thu, May 28, 2020 | 1,714,870 | 24,946 | 21,897 | 1.24 | 1 |
| Wed, May 27, 2020 | 1,689,924 | 20,063 | 22,029 | 1.11 | 1.01 |
| Tue, May 26, 2020 | 1,669,861 | 18,015 | 22,068 | 0.94 | 1 |
| Mon, May 25, 2020 | 1,651,846 | 19,070 | 22,318 | 0.92 | 1.01 |
| Sun, May 24, 2020 | 1,632,776 | 20,645 | 22,225 | 0.93 | 1.01 |
| Sat, May 23, 2020 | 1,612,131 | 22,277 | 22,187 | 0.95 | 1 |
| Fri, May 22, 2020 | 1,589,854 | 23,357 | 22,424 | 0.9 | 1 |
| Thu, May 21, 2020 | 1,566,497 | 25,954 | 22,711 | 1.18 | 1.01 |
| Wed, May 20, 2020 | 1,540,543 | 22,069 | 22,821 | 1.07 | 1.01 |
| Tue, May 19, 2020 | 1,518,474 | 20,607 | 22,935 | 0.95 | 1 |
| Mon, May 18, 2020 | 1,497,867 | 21,621 | 23,193 | 1.17 | 1.01 |
| Sun, May 17, 2020 | 1,476,246 | 18,518 | 23,196 | 0.76 | 0.99 |
| Sat, May 16, 2020 | 1,457,728 | 24,512 | 23,671 | 0.98 | 1 |
| Fri, May 15, 2020 | 1,433,216 | 24,906 | 24,061 | 0.93 | 0.99 |
Since most of the COVID-19 measures are enacted by the states, it may be more helpful for an individual to see the growth factor for the last 14 days in a specific state.
Build your own growth factor plot for a given state and time period by using a shiny app. The app can be accessed through this link, mareichler.shinyapps.io/state_gf/, and is also embedded below:
This plot is a quick overview of the last two weeks for each state. Which states have seen an average decrease in daily cases for 14 days and may be ready to start the re-opening process? Which states are not ready to reopen? And finally, which states may need additional social-distancing and lock-down measures?
As mentioned in the Growth Factor tab, some states will have very large growth factors. This is mostly due to having 0 cases on one day and then a large number the next day, likely caused by test availability or reporting.
It is also interesting to look at the growth factor at the county level, although this plot may not be that helpful since most lock-down measures are enacted at the state level. In addition, there are 11,597 cases that have not been assigned to a county. Although that accounts for only 0.676% of the total cases, it could have a big impact on a given state or county.
These gifs are not as helpful as I hoped they would be and do not do a great job of showing trends. This may be due to the large fluctuations in the growth factor, which make a clear trend difficult to see. I’m keeping them here because they look nice and they took me a long time to create, even though they don’t provide much intellectual value.
This data is downloaded from USA Facts. I use two of the three spreadsheets, one with total cases and one with total deaths, both broken down by state and county. This data requires additional formatting, calculation, and aggregation. USA Facts reports data by county on a daily basis; these county values are totaled to get values for each day for the entire US.
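The aggregation step might look something like the Python sketch below. The county names, the counts, and the dictionary layout are all made up for illustration and do not reflect the actual USA Facts spreadsheet format. The sketch also assumes that the first day's new cases equal the first cumulative total.

```python
# Hypothetical cumulative case counts per county, one value per day
cumulative_by_county = {
    ("MN", "Hennepin"): [10, 15, 25],
    ("MN", "Ramsey"): [2, 4, 4],
    ("WI", "Dane"): [5, 5, 9],
}


def national_totals(by_county):
    """Sum the county series day by day to get a US-wide cumulative series."""
    return [sum(day) for day in zip(*by_county.values())]


def daily_new(cumulative):
    """Convert cumulative totals to daily new counts.

    Assumption: the total before the first day was zero.
    """
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


us_total = national_totals(cumulative_by_county)
us_new = daily_new(us_total)
print(us_total)  # [17, 24, 38]
print(us_new)    # [17, 7, 14]
```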
The American CDC links to USA Facts under Cases & Death by County, which is how I found the data source.